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Abstract 

The neocortex has a remarkably uniform neuronal organization, suggesting that 
common principles of processing are employed throughout its extent. In particular, the 
-— h patterns of connectivity observed in the superficial layers of the visual cortex are con- 

sistent with the recurrent excitation and inhibitory feedback required for cooperative- 
competitive circuits such as the soft winner-take-all (WTA). WTA circuits offer inter- 
esting computational properties such as selective amplification, signal restoration, and 
decision making. But, these properties depend on the signal gain derived from positive 
feedback, and so there is a critical trade-off between providing feedback strong enough 
to support the sophisticated computations, while maintaining overall circuit stability. 
The issue of stability is all the more intriguing when one considers that the WTAs 
are expected to be densely distributed through the superficial layers, and that they 
are at least partially interconnected. We consider the question of how to reason about 
stability in very large distributed networks of such circuits. We approach this problem 
by approximating the regular cortical architecture as many interconnected cooperative- 
competitive modules. We demonstrate that by properly understanding the behavior 
of this small computational module, one can reason over the stability and convergence 
of very large networks composed of these modules. We obtain parameter ranges in 
which the WTA circuit operates in a high-gain regime, is stable, and can be aggregated 
arbitrarily to form large stable networks. We use nonlinear Contraction Theory to 
establish conditions for stability in the fully nonlinear case, and verify these solutions 
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available at www.mitpressjournals.org/doi/abs/10.1162/NECO_a_00091 
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using numerical simulations. The derived bounds allow modes of operation in which the 
WTA network is multi-stable and exhibits state-dependent persistent activities. Our 
approach is sufficiently general to reason systematically about the stability of any net- 
work, biological or technological, composed of networks of small modules that express 
competition through shared inhibition. 

1 Introduction 

Large biological and artificial systems often consist of a highly interconnected assembly of com- 
ponents (Fig 1). The connectivity between these elements is often densely recurrent, resulting 
in various loops that differ in strength and time-constant (Girard, Tabareau, Pham, Berthoz, & 
Slotine, 2008; Slotine & Lohmiller, 2001; Hopfield, 1982; Amari, 1977; Douglas, Koch, Mahowald, 
Martin, & Suarez, 1995; Liu, Wang, & Liu, 2006). This organization is true of the neocortex, 
where the statistics of connectivity between neurons indicate that recurrent connections are a fun- 
damental feature of the cortical networks (Douglas & Martin, 2004; Binzegger, Douglas, & Martin, 
2004; Douglas et al., 1995). These recurrent connections are able to provide the excitatory and 
inhibitory feedback necessary for computations such as selective amplification, signal restoration, 
and decision making. But, this recurrence poses a challenge for the stability of a network (Slotine 
& Lohmiller, 2001; Tegner, Compte, & Wang, 2002; Cohen & Grossberg, 1983). Connections may 
neither be too strong (leading to instability) or too weak (resulting in inactivity) for the network 
to function properly (Koch & Laurent, 1999). In addition connections are continually changing 
as a function of learning, or are accumulated semi-randomly throughout development or evolu- 
tion. How then, do these networks ensure stability? Artificial neural networks can rely on their 
bounded (e.g. sigmoid) activation functions, but biological neurons do not usually enter satura- 
tion. Instead, their stability depends crucially on the balance between inhibition and excitation 
(Hahnloser, Sarpeshkar, Mahowald, Douglas, & Seung, 2000; McCormick & Contreras, 2001). In 
this paper we explore how the stability of such systems is achieved, not only because we wish to 
understand the biological case, but also because of our interest in building large neuromorphic 
electronic systems that emulate their biological counterparts (Indiveri, Chicca, & Douglas, 2009). 

Reasoning about the computational ability as well as the stability of neural systems usually 
proceeds in a top-down fashion by considering the entire system as single entity able to enter many 
states (as for example in Hopfield networks (Izhikevich, 2007; Hopfield, 1982; Hertz, Krogh, & 
Palmer, 1991)). Unfortunately, the number of states that must be considered grows exponentially 
with the size of the network, and so this approach quickly becomes intractable. For this reason 
stability analysis of large-scale simulations of the brain are proving difficult (Izhikevich & Edelman, 
2008; Ananthanarayanan, Esser, Simon, & Modha, 2009; Markram, 2006). 

We present an alternative approach, which uses bottom-up reasoning about the modules that 
constitute the network. The idea is that the stability of the modules should be conferred on 
the networks that they compose. Of course, simply combining several modules, each of which 
is stable in isolation, to form a larger system does not necessarily imply that the new system is 
stable (Slotine & Lohmiller, 2001; Slotine, 2003). However, we explore the possibility that when the 
modules employ a certain kind of stability mechanism, then they are indeed able to confer stability 
also on the super-system in which they are embedded. We show that modules that achieve their 
own stability by observing constraints on their inhibitory/excitatory balance, can be stable alone 
as well as in combination. 

We have chosen to examine this problem in networks of WTA circuits (Yuille & Geiger, 2003), 
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because these circuits are consistent with the observed neuroanatomical connections of cortex 
(Douglas & Martin, 2004; Binzegger et al., 2004). Moreover, the WTA is interesting because it 
can implement useful computational operations such as signal restoration, amplification, max-like 
winner selection (i.e. decision making) or filtering (Maass, 2000; Hahnloser, Douglas, Mahowald, 
& Hepp, 1999; Douglas & Martin, 2007; Yuille & Geiger, 2003). And, combining multiple WTAs 
in a systematic manner extends these possibilities further by allowing persistent activity and state- 
dependent operations (Rutishauser & Douglas, 2009; Neftci, Chicca, Indiveri, Slotine, & Douglas, 
2008; Neftci, Chicca, Indiveri, Cook, k Douglas, 2010). 

Typically, WTA networks operate in a high-gain regime in which their operation is non-linear 
(e.g. selective amplification). While the stability of a WTA can be analyzed by linearizing around 
the possible steady-states, rigorous analysis that takes the non-linearities into account is difficult 
using linear analysis tools (Strogatz, 1994; Izhikevich, 2007; Hahnloser, 1998; Hahnloser, Seung, & 
Slotine, 2003). Instead, we use nonlinear Contraction Analysis (Lohmiller & Slotine, 1998; Slotine, 
2003; Lohmiller & Slotine, 2000) to investigate the stability of WTA networks. The concept 
of contraction is a generalization of stability analysis for linear systems, allowing Contraction 
Analysis (Lohmiller & Slotine, 1998) to be used for the analysis of circuits in the fully-non linear 
case, without making linearized approximations. 

A nonlinear time-varying system is said to be contracting if initial conditions or temporary 
disturbances are forgotten exponentially fast. Thus, any two initial conditions will result in the 
same system trajectory after exponentially fast transients. Importantly, the properties of con- 
tracting systems are preserved when they are combined to form a larger systems (Slotine, 2003). 
Also, contraction allows parameter regimes which are not unduly restrictive. For instance, it can 
describe strong feedback loops; and, ranges of parameters can be found where the system is both 
contracting and operating in a highly non-linear regime. In addition, contraction analysis can 
deal with systems that are multi-stable (expressing several stable attractors or behaviors), where 
it guarantees exponentially fast convergence to one of the possible attractors. Such systems are 
capable of rich state-dependent computations, while at the same being contracting. We have used 
Contraction Analysis to reason about the permissible kinds and strengths of connectivity within 
and between WTA modules embedded in a network. If the individual modules are contracting, 
then observing our constraints is sufficient to guarantee stability (boundedness) of a system com- 
posed of such modules. Thus, Contraction Analysis permits the derivation of simple bounds on 
the network parameters that will guarantee exponential convergence to equilibria in the fully non- 
linear case. This approach enables the systematic synthesis of large circuits, which are guaranteed 
to be stable if the set of bounds is observed. While we will demonstrate the feasibility of our 
approach in the case of WTA-type networks, our approach is not restricted to such networks. It 
can be applied as well to any simple non-linear circuit that is capable of non-linear computational 
operations. 

2 Results 

Our results are organized as follows. First, we introduce the basic organization of the WTA circuit. 
Second, we apply contraction theory to analyze the stability of networks of WTA circuits. We 
derive analytically the bounds on the parameters of the network that permit it to operate properly 
in either a soft-or hard WTA configuration. We conclude by performing numerical simulations to 
confirm that the analytical bounds are valid and not unnecessarily restrictive. 
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Figure 1: Illustration of the problem. The two arrows at the bottom represent external input 
whereas all other connections are internal and excitatory. Shown is a recurrently connected system 
composed of 5 modules (each of which is a recurrent network) . Given properties of the modules 
alone, can we guarantee the stability of the large connected system ? What constraints does each 
module have to observe for this to be true? 
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2.1 The winner-take all network 



Each winner-take all (WTA) x consists of N — 1 excitatory units x±„n-i and one inhibitory unit 
xn (See Fig 2A). Each excitatory unit receives recurrent input from itself (a^) and its neighbors 
( Q; 2,3,...)- For simplicity, only self-recurrence is considered here (0:2,3,... = 0), but similar arguments 
obtain when recurrence from neighboring units is included (see section 2.6). The inhibitory unit 
receives input from each excitatory unit with weight P2, and projects to each excitatory unit with 
weight (3\. The dynamics of each unit are described by Eqs 1 and 2. The firing rate activation 
function f(x) is a non-saturating rectification non-linearity max(0,x). The dynamics of this net- 
work, and in particular the boundedness of its trajectories, depends on the balance of excitation 
and inhibition. 

TXi + Gxi = f(Ii + axi - (3ix N - Ti) (1) 

N-l 

rx N + Gx N = /(& x j - t n) (2) 
j=i 

Where Ii(t) is external input to unit i. All thresholds Tj > are constant and equal. G > is 
a constant that represents the load (conductance) and is assumed G — 1, unless stated otherwise. 
All parameters are positive: a > 0, (5i > 0, /3 2 > 0. We will refer to such a system either as a WTA 
or a "recurrent map" throughout the paper. "Map" will denote a WTA throughout, and not a 
discrete dynamical system. 

2.2 Combining several WTAs 

A single WTA network can implement some useful computational operations (see Introduction). 
However, more sophisticated computational operations can be achieved by combining several 
WTAs (Rutishauser & Douglas, 2009) by sparse and selective connections between some of the 
excitatory units of the various WTAs. We consider two ways of combining WTAs: bidirectional 
and unidirectional. A bidirectional (and symmetric) connection establishes a recurrent connection 
between two WTAs. A unidirectional connection provides the activity of one WTA as input to a 
second WTA (feed-forward). The inhibitory units neither receive input from, nor do they project 
to, any other map. Thus, activity between maps is always excitatory (positive). This arrangement 
is motivated by the long range connections in cortex, which are predominantly excitatory (Douglas 
& Martin, 2004; Douglas et al., 1995) (but they can contact both excitatory and inhibitory targets). 
While long-range inhibitory projections in cortex exist as well, we focus exclusively on excitatory 
long-range connectivity in this paper. 

These two kinds of connectivity are motivated by our previous finding that three WTAs con- 
nected by a combination of bi-and unidirectional connections are sufficient to implement state- 
dependent processing in the form of an automaton (Rutishauser & Douglas, 2009). An automaton 
consists of two components: states, and transitions between states. By connecting two maps 
bidirectionally, the network is able to maintain one region of persistent activity in the absence of 
external input, and this winning region represents the current state of the automaton. (State de- 
pendence is a form of memory and we thus refer to these localized regions of persistent activity as 
memory states.) Transition circuits allow the network to select a new winner, conditioned on the 
current state as well as an external input. The implementation of these transitions requires a third 
WTA (to select the most appropriate transition) as well as unidirectional connections between the 
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Figure 2: Illustration of connectivity and notation. Excitatory and inhibitory connections are 
denoted by straight and dashed lines, respectively (A) Single WTA with all connections shown 
with respect to £3. (B) Simplified version with N = 3 units per map and a 2 = and 0*3 = 0. (C) 
Combination of two WTAs by symmetric bidirectional connection 7. (D) Network comprising 3 
WTAs x, y and z; connected by 7 and 4> connections. The network has two states (each represented 
by one unit on maps x and y) and two transition units Z\ and z 2 . External input Ii(t) or I 2 (t) to 
either z\ or z 2 signals the arrival of a symbol to be processed by the network by executing a state 
dependent transition. If the network is in state 1, activation of z\ initiates a transition from state 
1 to 2. If the network is in state 2 and z 2 becomes active, the network remains in state 2 (a loop, 
see text). The local wiring on each WTA is not shown, but is equivalent to the connectivity of 
(B). 
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maps that drive the transition (see below). In this paper we explore what constraints the presence 
of these additional connections poses on the stability of this and larger (more than three WTAs) 
networks. 

First, consider two identical WTAs x and y (see Fig 2C). Each WTA consists of N = 3 units (2 
excitatory, 1 inhibitory). The only connection between the two networks is 7, which symmetrically 
(bidirectional) connects x 2 and y 2 . Thus, this network can only represent one state. 

The update equations for x 2 and y 2 thus become: 

rx 2 + Gx 2 = f(h + olx 2 + 71/2 - Pix N - T) (3) 

rm + Gy 2 = f(ay 2 + 7x2 - Piy N - T) (4) 

Second, we consider unidirectional connections between WTAs. These are feed-forward con- 
nections between two maps: For example, when units on map x provide input to units on map 
z. However, such feed-forward connections can result in (indirect) recurrence: For example, when 
map z in turn provides input to x. Thus, analysis of unidirectional connections requires that we 
consider three maps x,y and z simultaneously. The two maps x and y are connected bidirectionally 
as shown above, whereas z contains units that receive external input as well as input from y and 
also provide output to x (Fig 2D). In this way, strong enough activation of units on z can bias 
the ongoing competition in the network and thereby induce a switch to a new winner (so changing 
state). 

A given unit on z can either receive input from a different unit than it projects to (so providing 
a transition from one state to an other); or it can receive from and project to the same state. In 
Fig 2D, z\ is an example of a unit that initiates a transition from state 1 to 2, whereas z 2 receives 
input from and projects to state 2. Thus, z 2 establishes an additional loop of recurrent feedback 
and is the more restrictive case when considering stability. 

Following Fig 2D, the dynamics of X\ and x 2 become 

TXx +xi = f(h + axi + 71/1 - PiX N - T) (5) 

tx 2 + x 2 = f(I 2 + ax 2 + 7|/ 2 + (j)Zi + <pz 2 - p x x N - T) (6) 

and similarly for yi, y 2 . 

The dynamics for the two new units Z\ and z 2 are 

TZ X + Zi = f(I T Nl + OLZx + 0yi - PiZ N - T - T TN ) (7) 

tz 2 + z 2 = f(I TN2 + az 2 + <py 2 - Piz N - T - T TN ) (8) 

The equations for the other units of the system are equivalent to the standard WTA. 

The term Ttnj is an additional constant threshold for activation of the transition unit, so that 
in the absence of an external input Itnji the transition unit will remain inactive Zj = 0. The 
external input Itni can be used to selectively initiate a transition. An appropriate choice of the 
threshold Ttnj will ensure that the transition unit Zj is active only when both the external input 
Itni > and the input from the projecting map yj<f) > are present. The activation of Zj is thus 
state dependent, because it depends both on an external input as well as the current winner of the 
map. 

Now we will explore what constraints the presence of 7 > and > impose on stability. We 
will use Contraction Analysis to show that, if the single WTAs are contracting, 7 and <fi can be 
used (with an upper bound) to arbitrarily combine WTAs without compromising the stability of 
the aggregate system. Since we base our arguments on contraction analysis, we will first introduce 
its basic concepts. 
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2.3 Contraction Analysis 

Essentially, a nonlinear time-varying dynamic system will be called contracting if arbitrary ini- 
tial conditions or temporary disturbances are forgotten exponentially fast, i.e., if trajectories of 
the perturbed system return to their unperturbed behavior with an exponential convergence rate. 
It turns out that relatively simple algebraic conditions can be given for this stability-like prop- 
erty to be verified, and that this property is preserved through basic system combinations and 
aggregations. 

A nonlinear contracting system has the following properties (Lohmiller & Slotine, 1998, 2000; 
Slotine, 2003; Wang & Slotine, 2005) 

• global exponential convergence and stability are guaranteed 

• convergence rates can be explicitly computed as eigenvalues of well-defined Hermitian ma- 
trices 

• combinations and aggregations of contracting systems are also contracting 

• robustness to variations in dynamics can be easily quantified 

Before stating the main contraction theorem, recall first the following. The symmetric part of 
a matrix A is A# = |(A + A* T ). A complex square matrix A is Hermitian if A T = A* , where T 
denotes matrix transposition and * complex conjugation. The Hermitian part A H of any complex 
square matrix A is the Hermitian matrix |(A + A* T ) . All eigenvalues of a Hermitian matrix 
are real numbers. A Hermitian matrix A is said to be positive definite if all its eigenvalues are 
strictly positive. This condition implies in turn that for any non-zero real or complex vector x, 
x* T Ax > 0. A Hermitian matrix A is called negative definite if —A is positive definite. 

A Hermitian matrix A(x, t) dependent on state or time will be called uniformly positive definite 
if there exists a strictly positive constant such that for all states x and allt > the eigenvalues of 
A(x, t) remain larger than that constant. A similar definition holds for uniform negative definite- 
ness. 

Consider now a general dynamical system in R™, 

x = f(x,t) (9) 

with f a smooth non-linear function. The central result of Contraction Analysis, derived in (Lohmiller 
& Slotine, 1998) in both real and complex forms, can be stated as: 

Theorem Denote by J£ the Jacobian matrix of f with respect to x. Assume that there exists 
a complex square matrix ®(x,t) such that the Hermitian matrix &(x, t)* T @(x, t) is uniformly 
positive definite, and the Hermitian part Fh of the matrix 

F= (e + eg)e- 

is uniformly negative definite. Then, all system trajectories converge exponentially to a single 
trajectory, with convergence rate | sup xt A max (F#)] > 0. The system is said to be contracting, F is 
called its generalized Jacobian, and &(x, t)* T @(x, t) its contraction metric. The contraction rate is 
the absolute value of the largest eigenvalue (closest to zero, although still negative) A = |A mox Fjy|. 

In the linear time-invariant case, a system is globally contracting if and only if it is strictly 
stable, and F can be chosen as a normal Jordan form of the system, with © a real matrix defining 
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the coordinate transformation to that form (Lohmiller & Slotine, 1998). Alternatively, if the 
system is diagonalizable, F can be chosen as the diagonal form of the system, with © a complex 
matrix diagonalizing the system. In that case, F# is a diagonal matrix composed of the real parts 
of the eigenvalues of the original system matrix. 

Note that the activation function f(x) = max(x,0) (see Eqs 1-2) is not continuously differen- 
tiable, but it is continuous in both space and time, so that contraction results can still be directly 
applied (Lohmiller & Slotine, 2000). Furthermore, the activation function is piecewise linear with 
a derivative of either or 1. This simple property is exploited in the following by inserting dummy 
terms lj, which can either be or 1 according to the derivative of f(x): lj = ^f(xj(t)). For a 
single WTA, there are a total of N dummy terms. 



2.4 Stability of a single WTA 



We begin the contraction analysis by considering a single WTA. The conditions obtained in this sec- 
tion guarantee that the dynamics of the single map converge exponentially to a single equilibrium 
point for a given set of inputs. Actually, the WTA has several equilibrium points (corresponding 
to each possible winner), but contraction analysis shows that for a given input a particular equilib- 
rium will be reached exponentially fast, while all others are unstable. Thus, as long as the network 
does not start out exactly at one of the unstable equilibria (which is impossible in practice), it 
is guaranteed to converge to the unique equilibrium point (the winner) determined by the given 
set of inputs. Our strategy is two-fold: first we show that the WTA is contracting only if one of 
the excitatory units is active (the "winner" in a hard- WTA configuration). Second, we show that 
in the presence of multiple active excitatory units, the dynamics diverge exponentially from the 
non-winning states. 

Following section 2.3, a system with Jacobian J is contracting if 



r© J 1 < 



(10) 



The Jacobian J has dimension N and describes the dynamics of a single WTA, and © is a trans- 
formation matrix (see section 2.3 and below). Using dummy terms lj as shown in the previous 
section, the Jacobian of the WTA is 



rJ 



ha — G 






l 2 a — 



G 



-G 



(11) 



This WTA has two possible winners (xi or x 2 ) that are represented by li = l,/ 2 = or 
h — 0, 12 — 1, respectively (Z 3 = 1 for both). Assuming the second unit is the winner, the Jacobian 
becomes 



tJ W2 = 



-G 
a-G 





-A 
-G 



(12) 



Our approach consists in first finding a constant metric transformation © describing the contraction 
properties of the simple Jacobian (12) for appropriate parameter ranges, a process equivalent to 
standard linear stability analysis, and then using the same metric transformation to assess the 
contraction properties of the general nonlinear system. 
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Let us first find ranges for the parameters a, /?i , /3 2 such that Jw2 is contracting. This is the case 
if r@ 3w2 © 1 < 0, where defines a coordinate transform into a suitable metric. The left hand- 
side is the generalized Jacobian F = ©J^®" 1 (see section 2.3). Based on the eigendecomposition 
3w2 — QAQ \ where the columns of Q correspond to the eigenvectors of Jw2, define © = Q _1 . 



This transformation represents a change of basis which diagonalizes F (Horn, 1985). This choice 
of a constant invertible © also implies that ©* T © is positive definite (since x* T @* T @x = ||@x|| 2 , 
Vx). 

Using this transformation and assuming G — 1, the Hermitian part of F (Eq 10) is negative 
definite if 1 

0<a<2^/M~2 (13) 



0</3 2 
< ftft < 1 



(14) 
(15) 



Note that these conditions systematically relate a to the inhibitory loop gain (3i(3 2 , and also permit 
a > 1 (see below for discussion). 

The above conditions guarantee contraction for the cases where inhibition (/ 3 = 1) and one 
excitatory unit are active (here l 2 — 1 and l± — but the same bounds are valid for l 2 — and 
li = 1). The next key step is to use the same metric to study arbitrary terms ^,3 and l\ = 0, so as 
to show that the system is contracting for all combinations of ^,3, except the combinations from 
which we want the system to be exponentially diverging. In the same metric © and using the 
Jacobian Eq 11 with li — the Hermitian part of F becomes (with i 2 = —1) 



H 



1 










-1 + \od 2 

{-2ip 1 p 2 +a{ia+y/-a 2 +APiP2))(l2-h) 
2 v /- a 2+4/3 1 /3 2 







{2ip 1 p 2 +<x{-i<x+y/-<x 2 +Wifo)){h-h) 
2y/-a 2 +Ap 1 p 2 

-1 + \cd 2 



(16) 



Note that Eq (16) was simplified assuming the bound given in Eq 13. We require F H < 0. A 



matrix of the form 



Ax r 
r* \ 2 

2005). For (16), this results in 



is negative definite if Aj < and \r\ 2 < AiA 2 (Wang & Slotine, 



-a 2 + 4/5x/3 2 



< 



1 

2"' 



-1 + >/ 2 ) 2 



(17) 



The bounds (13)-(15) on the parameters satisfy this condition whenever l 2 = l 3 and l 2 — 0, l 3 — 1. 
As expected, for the case l 2 — 1,Z 3 = (only excitation active) the system is not contracting for 
a > 1. Rather, we require that in this case the system is exponentially diverging, as we detail 
below. 

Next, we consider the full Jacobian (Eq 11) with all Z 12 ,3 = 1- For the network to be a hard- 
WTA, we require that this configuration is exponentially diverging. The dynamics of interest 
are the excitatory units, so that, following (Pham & Slotine, 2007), the system is exponentially 
diverging away from this state if 

VJV T > (18) 



^hese solutions are derived by considering the eigenvalues of the Hermitian part of (10), which 
is diagonal and real, and then solving the system of inequalities \ ma x < 0. 
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Figure 3: Illustration of the different operating conditions as a function of the excitatory strength 
a and the inhibitory loop gain ftft. Note the transition from a soft-WTA to hard-WTA at a = 1. 



where V is the projection matrix 



V 



a -ft 
a -ft 



(19) 



is constant. For V as shown in (19), Vx 



Each row represents one excitatory 



The constraint (18) assures that the system diverges from certain invariant subspaces where Vx 

axi — ftx 3 
ax 2 - ftx 3 

unit. If condition (18) is satisfied, the network is guaranteed to diverge exponentially away from 
this equilibrium. 

Condition (18) is satisfied (for G = 1) if 



1 < a 



0<ft 
< ftft < (1 



1 oi 

> 2 + T 



(20) 
(21) 
(22) 



The above conditions were derived based on the system of inequalities X m i n > given by the 
eigenvalues of the Hermitian part of the left-hand side of (18). The same calculation using instead 
lx t 2 = 1 and l 3 = (excitation, but no inhibition) results in the same bounds for exponential 
divergence from the state of no inhibition. 

Combining (i) conditions (13)-(15) for exponential convergence to the winner state and (ii) con- 
ditions (20)-(22) for exponential divergence from the non-winning and the excitation-only states, 

yields 

1< a < 2v/ftft (23) 

7 < < 1 ( 24 ) 



a I 



(25) 



Note the two key components: the excitatory gain a and the inhibitory gain ftft. The above 
conditions establish lower and upper bounds on the parameters for global exponential convergence 
to a unique winner for a given set of inputs. 
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Under these constraints (in particular on the excitatory loop strength a) the system is globally 
convergent yet always selects a winner. The system does not depend on saturation to acquire this 
stability. Also, the constraints guarantee that the system does not oscillate, apart from transient 
oscillations during convergence. This has been established by demonstrating that the system is 
either contracting or exponentially diverging for any subset of the dummy terms ^1,2,3- Note that 
the system is contracting in the same metric for all contracting subsets. While we defined 
the metric © for a particular winner, the same constraints result from defining a similar for 
any of the other possible winners. Similar conditions can be derived for conditions where the 
winner is represented by multiple active units such as when a "bump of activity" is introduced 
by adding excitatory nearest-neighbor connections a 2 (Rutishauser & Douglas, 2009; Douglas & 
Martin, 2007) (see section 2.6). Numerically, these ranges permit a wide range of parameters. For 
example, for (3i = 3 and $2 = 0.3, 1 < a < 1.89. Under these conditions, the system operates in a 
highly non-linear regime (where the loop gain can be up to 50!). 

The analysis above focused on the regime where a > G (with G = 1). In this mode, the system 
acts as a highly non-linear WTA, always selecting a binary winner. What if the system operates 
in a < 1 ? In this configuration, the winner unit is still contracting (Eq 13). 

What happens when all units (^1,2,3 = 1) are active and a < 1? Defining based on the 
Jacobian J with all units on and solving 0J0 1 < 0, we find that this system is contracting for 
a < 1. The system where all excitatory units are active is thus contracting under this condition, 
implying that the system is in a "soft- WTA" configuration. While the system still selects a 
winning unit, the activity of the loosing unit is not completely suppressed. Also note that in 
this configuration, no persistent activity in the absence of external input is possible. A graphical 
illustration of both modes of operation is shown in Fig 3. 

Finally, note that the time-constant r was assumed to be equal for all units. Note that, in this 
case, the numerical value of r does not influence the bounds (since r > multiplies the entire 
Jacobian, see Eq 10). Similar conditions can be derived for conditions where the time-constants 
are not equal (see Appendix C), in which case only the ratio of the time-constants is relevant. 

2.5 Stability of single WTA of arbitrary size 

Can this analysis be extended to maps of arbitrary size? While the approach in the previous 
section can be applied to maps of any size, an alternative approach is to first define contraction for 
a map consisting only of a single excitatory and inhibitory unit and then extend it recursively by 
one unit at a time, while showing that this extension does not change the contraction properties. 
This approach is illustrated in Fig 4. 

The most simple map consists of one excitatory and one inhibitory unit (Fig 4A) . While there 
is no competition between different inputs, this map otherwise preserves all the properties of a 
WTA (such as non-linear amplification of the input). The Jacobian of this map is: 

(26) 

This system (Fig 4A) is contracting if the conditions shown in Eqs 23-25 for the parameters 
a, Pi, 02 hold. The approach used to derive the bounds is equivalent to the one described above: 
first, define a = Q _1 , where Q is based on the eigendecomposition of A with / 12 = 1. Then, 
define the valid parameter ranges based on Eq 10. The same permissible parameters result (see 
Eqs 13, 15, 14). 
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Figure 4: Illustration of combining a simple WTA to form a bigger WTA . White units are 
excitatory, gray units inhibitory. (A) Simple map consisting of one excitatory unit X\ and one 
inhibitory unit x^. (B) Combining two identical copies of the map shown in (A) by providing 
excitatory input /3 2 from each excitatory unit to both inhibitory units. This new map, consisting 
of two excitatory units, is functionally equivalent to a WTA with two excitatory and one inhibitory 
units. The two inhibitory units x 2 and £4 have, at all times, the same level of activity. The same 
principal can be extended to form maps of arbitrary size. 



Combining two such maps by feeding excitatory input to both inhibitory neurons by both 
excitatory neurons leads to a WTA with two excitatory units (Fig 4B). This map is equivalent to 
the map shown previously, except for that it contains two inhibitory neurons. These are, however, 
functionally equivalent (their activity and derivatives are the same at all points of time). Thus the 
behavior of both systems will be equivalent. The Jacobian of the combined system is: 



rJ 



Ax d 
G 2 A 2 



where 



rG 





I2P2 



(27) 



(28) 



and A 12 = A after adjusting the k terms appropriately (Zi )2 and / 3j4 for Ai and A 2 , respec- 
tively). Similarly, Gi )2 = G for l 2 and I4, respectively. Note that combining the two systems in 
this way adds only two (strictly positive) terms to the equations describing the dynamics of the 
inhibitory neurons. Thus, inhibition in this new system can only be larger compared to the smaller 
system. Thus, if the smaller system is contracting (as shown above), the combined system must 
also be contracting (shown in the next section). 

Defining a metric based on the eigendecomposition of J for either / lj2 4 = 1 and l 3 = or 
^1,3,4 — 1 an d ^2 = and then solving 



r0 J < 



(29) 



results in the same constraints for the system to be contracting (see Eqs 13, 15, 14). 

This result can be generalized so that it is valid for adding one unit to a map that is already 
contracting. This can be seen directly by considering the eigenvalues of the Hermitian part of 
F = 0J0 defined either for a system with n units or n + 1 units. A system with n = 1 units 
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has Jacobian Ai and is contracting as shown previously. The condition for it to be stable F s < 
requires (for the real part only) 



\{-2 + a± - 4fhfo) < 



A system with n = 2 units has Jacobian J (Eq 27) and is stable if (29) holds. This requires 



\{-2 + a ± Va 2 - < 



(30) 



(31) 



Comparing Eqs 30 and 31 reveals that adding a unit n+1 to a system of n units does not change the 
conditions for contraction to a single winner. Thus, if the recurrent map consisting of n excitatory 
unit is contracting the system of n + 1 units is also contracting. By recursion this proof can be 
applied to maps of arbitrary size. 

What if multiple units on the map are active? Above conditions show that a single winner is 
contracting on an arbitrary sized map. In a hard-WTA configuration, the system should always 
emerge with a single winner. We have previously shown that our system has this property when 
a > 1 (see Eq 18). Here, we extend this argument to maps of arbitrary size. Note that only the U 
for the excitatory units can be switched inactive. All inhibitory neurons (since they all represent 
the same signal) are always k = 1. 

Here, we start with a system that has n = 2 units (since a system of n — 1 does not have 
competition). The goal is to find conditions that enforce a single winner for n = n + 1 units. For 
the n = 2 system (J with all £1,2,3,4 = 1), enforcing VJV T > (see Eq 18) with 



V 



a 




-0i 

-0i a 



-0i 
0i 



(32) 



gives conditions for this configuration (both units on, i.e. h = I3 = 1) to be exponentially unstable 
(thus converging to an other subset of the terms). Similar to (19), the system diverges from 
invariant subspaces where Vx is constant. For the projection (32), ax\ — 0\x 2 — 0ix± = and 
axo, — 0\X2 — 0iX4 = defines the equilibrium. If condition (18) is satisfied, the network is 
guaranteed to diverge exponentially away from this equilibrium. 

The eigenvalues of the Hermitian part of this system (same as for Eq 18) are uniformly positive 
if the following two conditions hold 



-a 



+ a 3 > 0, 



-a 



a 3 - 0l + 2a0l - 4a0 1 2 > 



(33) 



Note that any solution requires a > 1 (solutions are shown in (20)-(22)). This condition thus 
shows that any two simultaneously active units can not be contracting if a > 1. 
For the 3 unit system, applying (27) recursively results in the Jacobian 



rJ = 



Ai Gx d 
G2 A2 G2 
G3 G3 A3 



(34) 



Applying an appropriate V constructed in analog to (32) shows that VJV T > for this system if 

-a 2 + a 3 > 0, -a 2 + a 3 - 6/3? + 3a0 2 - 9a0!0 2 > (35) 
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Note that a sufficient solution continues to require a > 1. We have thus shown, under a > 1 
that a system with n = 2 as well as n = 3 can only have one active unit. By recursion, the same 
argument can be used to show that any system n = n + 1 can not have a subset of % units (where 
\ < % <= n) active. Any such system is thus always converging to a single winner. Any such 
system will have these properties if the parameters are within the ranges shown in Eqs 20, 21, 22) 
hold. 

For purposes of this proof, we used additional inhibitory units (one for each excitatory unit). 
Note that this arrangement is for mathematical convenience only: In an implemented system these 
units can be collapsed to one unit only (or several to implement local inhibition). Collapsing all 
units to one does not change the dynamics of the system, because all inhibitory units have the 
same activity (and its derivatives) at all times. 



2.5.1 Example 

This example will show how to apply the approach outlined above in order to calculate the permis- 
sible range of parameters for a toy recurrent map consisting of one excitatory and one inhibitory 
unit (Fig 4A), whose Jacobian is A (Eq 26) with l ii2 = 1. Our intention is to illustrate in detail 
the procedural aspects involved in the calculation. 

First construct Q based on the eigenvectors of A and then set 



-4/31/32 
a 



= Q 1 = V" 2 - 4 /^ : 

fe 

y/oP-ifofa 2 2^2-4/31/32 

Then, transforming A using © results in the generalized Jacobian 



(36) 



F = r© A © 1 



-2 + a 



V 7 ^ 




4/?iA 



-2 + a + y/a 2 - 



(37) 



Due to the choice of the metric G, only terms on the diagonal remain. The network is contract- 
ing if the Hermitian part = |(F + F* T ) (37) is negative definite. A sufficient condition for this 
to be the case is Re(X m i n (F s )) < 0. Solving this system of inequalities results in the conditions 
shown in (13, 14, 15). 



2.5.2 Comparison with numerical simulations 

Do the analytical bounds derived above match the behavior of the system when it is simulated? 
We simulated a WTA network as described (with 2 excitatory units) and systematically tested 
different combinations of the parameters a, f3±, fa- For each simulation we determined whether all 
units in the network reach steady state with Xi = after a sufficient amount of time. Such networks 
where classified as stable or unstable, respectively (Fig 5). Here, we vary a and /3i while keeping 
02 = 0.25. While the analytically derived solution is slightly more conservative than necessary 
it closely matches the results of the simulations (see Fig 5 and legend for details). The crucial 
parameter is the excitatory strength relative to the inhibitory strength. This can be seen from the 
general increase of the permissible value of « as a function of Pi (Fig 5). Note however that our 
analytical solution assigns an upper bound to (3\ as well, which is unnecessary for the numerical 
simulations. However, strong values of (3i lead the system to oscillate and keeping the parameter 
within the range derived analytically prevents this problem. 
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Figure 5: Comparison of the permissible parameter range derived analytically, with that obtained 
by simulation. Results are shown as a function of a and (3i, while holding /3 2 = 0.25 (constant). 
The region of contraction is indicated in light blue. The upper boundary is 0i < j^. The right 
boundary is the upper-bound on excitation, a < 2^J fiifa- Simulation results are indicated by 
colored dots, where a red dot indicates success (contraction) and gray failure. For convenience, 
only the range a > 1, fi\ > 1 is shown here. 
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2.6 Stability of single WTA - bump of activity 



The previous analysis considered WTA networks where only <y,\ = a > and 0-2 = (Fig 2A). In 
this configuration, the winner of the competition is represented by a single active unit. However, 
cooperation between neighboring units can also be introduced, by setting < oli < a±. The winner 
is now represented by a more distributed "hill of activity". Our analysis can also be extended to 
this case. 

For the simplest case of 2 units, this network has the Jacobian 



rJ 



lioci — G 

l 2 CH 2 



ha 2 
l 2 a\ — G 



-G 



(38) 



with Zi )2 ,3 = 1. Using the approach outlined previously, this system is stable if 0J0 -1 < 0. After 
applying the coordinate transform, examining the eigenvalues of the Hermitian part of this 
system reveals that 

(39) 



1 

2' 



-2 + aii + a 2 ± a/ (aii + «2p 



ifo) < 



is a required condition (plus others, not shown). Comparing this condition to the eigenvalues of 
the system with 0-2 = (see (31)) reveals that a was replaced by «i + 0-2 (plus some other minor 
modifications). This result confirms the intuition that the crucial factor is the total excitatory 
input ai + a 2 to any one unit. A sufficient condition for this system to be contracting is (compare 
to Eq 13) 



< «i + a 2 < 



(40) 



This condition applies as long as a\ + a 2 < 2 and a,\ — a 2 < 1. Together, these conditions similarly 
permit a fairly wide range of parameters, including ct\ > 1. For example, if /3i — 3, f3 2 — 0.3 and 
a 2 = 0.5, «i < 1.5. Note the critical trade-off between the inhibitory gain (3i(3 2 and the excitatory 
gain ai + a 2 that is expressed in this section. 



2.7 Stability of two bidirectionally coupled WTAs 

Next we consider how two WTAs x and y can be coupled stably (by 7 connections as shown above). 
The key idea is first to give sufficient conditions for stable synchronization of the two WTA's. Note 
that by synchronization we mean here that two variables have the same value (in contrast to other 
meanings of synchronization i.e. in population coding). This allows the dimensionality of the 
stability analysis to be reduced. Indeed, synchronization implies that the overall system stability 
can then be analyzed simply by considering the stability of the individual target dynamics, i.e., 
of any one of the subsystems where the external coupling variables have been replaced by the 
corresponding (endogenous) variables in the subsystem. For instance, in the target dynamics of x, 
equation (3) is replaced by 

tx 2 + Gx 2 = f(I 2 + ax 2 + 7^2 - PiX N - T) (41) 

Next, we shall see that in fact, given the form of coupling we assume, stable synchronization of the 
subsystems comes "for free". That is, it is automatically satisfied as long as sufficient conditions 
for the stability of the individual target dynamics are satisfied. 
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Following (Pham & Slotine, 2007), synchronization occurs stably if the following holds: 



VJV J < 



(42) 



where 



\=[I N -I N ] 



(43) 



and J is the Jacobian of the entire system. Here, we define synchrony as equal activity on both 
maps, i.e. Xi = for all i. This condition is embedded in V as shown. Note that the system 
need not start out cts %i — y>i to begin with but rather the condition embedded in V guarantees 
that the system will converge towards this solution. Other conditions of synchrony (such as only 
some neurons synchronizing) can similarly be specified by modifying V accordingly. V specifies a 
metric which is orthogonal to the linear subspace M in which the system synchronizes (i.e. 
a flow-invariant subspace, see Theorem 3 in (Pham & Slotine, 2007)). 

The Jacobian J has dimension 2N and is composed of the two sub-Jacobians J 1 and J 2 (as 
shown in Eq 11), which describe a single WTA, and of the Jacobians of the couplings. 
Introducing the coupling term 

~ 7 " 

C = 7 (44) 




results in the Jacobian J of the full system: 



J 1 C 
C J 2 



which can be written, using again dummy terms L, as 
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The above expression yields: 
r VJV T = 



(46) 



(Zi + Z 4 )(a-7) 


Hh + h) 



2G -Pi(h + h) 

(h + h)(a-l)-2G -^(h + h) 
(3 2 (l 3 + h) -2G 



(47) 



Note that a > 0, ft > 0, (3 2 > 0. 

Consider now the Jacobian of e.g. subsystem-1 once synchronized, i.e., with the coupling terms 
from subsystem-2 variables replaced by the same terms using subsystem-1 variables (this is what 
we called earlier the target subsystem-1). Given equation (11) and (41), this Jacobian can be 
written 

h(a + i)-G -h^ 

rJL,„= l 2 {a + i)-G -hpi (48) 

I3P2 hP2 ~G 



sync 
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Comparing (47) and (48), we see that sufficient conditions for i\ ync (and similarly 3 2 sync ) to be 
negative definite automatically imply that VJV T is negative definite. Indeed, since 7 > 0, 

V lj , 3l ync < => V lj , VJV T < (49) 

In other words, the basic requirement that the individual target dynamics are stable (as shown in 
the previous section) automatically implies stability of the synchronization mechanism itself. 

Note the opposite signs of 7 in Eqs (48) and (47). Intuitively, these express a key trade-off. 
Indeed, the stronger 7 is, the easier and stronger the synchrony of the memory state (Eq (47)). 
However, a stronger connection also makes the system less stable. This is expressed by the positive 
7 in Eq (48), which imposes stricter constraints on the permissible values of the other weights for 
J]y nc to remain negative definite. 

Synchronization of the two maps in this way allows reduction of the two coupled systems to 
a single virtual system with the additional parameter 7 for the coupling strength (Eq 47,48). 
Stability of this hybrid system guarantees stability of the synchronization mechanism itself (Eq 
49). The upper-bounds for 7 are thus (based on Eq 13) 

7 <2v^-a (50) 

As long as this condition is met, the dynamics of each map are contracting and their synchroniza- 
tion is stable. The lower-bound on 7 is determined by the minimal activity necessary to begin 
"charging" the second map (which gets no external input in our configuration). The minimal 
activity that a unit on the second map gets as input from the first map needs to be larger than its 
activation threshold T, i.e. 2^7 > T where Xi is the steady-state amplitude during the application 
of input (which is a function of the gain g = 1+ ^ 2 _ a ). Thus, 

T 



(51) 



2.8 Stability of unidirectionally coupled WTAs 

Next, we extend our analysis to networks consisting of 3 WTAs x, y and z of the kind shown in Fig 
2D and described in section 2.2. WTAs x and y are bidirectionally coupled to express the current 
state and are equivalent to the network considered in the previous sections. A further WTA z is 
added that contains units Zi, referred to as transition neurons (TNs). In this example, there are 
two TNs z 1 and z 2 (Fig 2D). Activation of the first (z{) leads the network to transition from state 
X\ to x 2 , if the network is currently in X\. Activation of the second (z 2 ) leaves the network in state 
x 2 , if the network is currently in this state. If it is not so, then no activity is triggered. The TN z\ 
is an example of a transition from one state to another. TN z 2 is an example of a transition that 
starts and ends in the same state (a loop). This loop is intentionally introduced here, because it 
poses a limit to stability. TNs receive and project input with weight > 0. 
The Jacobian of the full system consists of 3N variables: 
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Since there are two memory states, 
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P 1 describes the input and P 2 the output of the TNs. Here, 



P x = 


























































(54) 



(55) 



2.8.1 Case 1: loop 

For purposes of worst-case analysis, assume that the TN z 2 (which receives and projects to state 
2) is permanently active. This is achieved by setting T TN = 0. In this case, we require that the 
network remains synchronized in state z 2 . 

The state is stable with z 2 activated if the synchrony between x and y is not disrupted. This 
is the case if 

" (Zi + Z 4 )(a-7)-2 -Pi(h + U)~ 

rVJV T = (Z 2 + Z 5 )(a- 7 )-2 -ft(Z 2 + Z 5 ) (56) 

with V = [Ijv — I at Ojv] and J the Jacobian of the entire system (9 variables). 

Note the similarity to equation (47). None of the non-linearity terms of the 3rd WTA l 7 ,l s ,l 9 , 
nor 0, appear in this equation. Thus, the synchrony of the states is not influenced by the presence 
of a consistently active loop TN. The presence of does thus not influence the synchrony between 
x and y which represent the state. However, this combined system also needs to be contracting 
for this system to be stable (i.e. reach steady state). Thus, we next derive the limits on for this 
to be the case. 

Using the insight gained in section 2.7, we replace the yi terms by Xi terms for purposes of 
stability analysis. Note that the principle of showing synchronization first introduces a hierarchy 
(or series) of dynamic systems, so that the overall result converges if each step (sync and simplified 
system) does, with convergence rate the slowest of the two. In our case the synchronization step 
is always the fastest, so the overall convergence rate is that of the reduced system. 

Next, we analyze the stability of the reduced system (consisting of x and z). Here, only z 2 
(loop TN) is used, z\ is not connected. The corresponding Jacobian is: 
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(57) 



Having 3™ be negative definite in the metric ©, 

V lj , @3 TN @ 



< 



(58) 

guarantees that the coupled system is stable. Following (Wang & Slotine, 2005) and (Slotine, 
2003) (section 3.4), if the uncoupled systems are stable with contraction rates and X z , then the 
coupled system is stable if 



2 < 



(59) 
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and its contraction rate is 



V. = - /(^^ (60) 

Note that X x<z > is equivalent to condition (59). One then has < X x<z < X x < X z . Note 
that if the connection weights are not symmetric, in the expressions above can be replaced by 

= &±&. 

The contraction rate for a single WTA is equal to the absolute value of the largest eigenvalue 
of F s (its real part) (Wang & Slotine, 2005). Following (10), the contraction rate for a WTA (such 
as z) is X z = \Re{\{— 2 + a + a/ck 2 — 4/3 1 /3 2 ))|- Similarly, for a symmetrically coupled system with 
coupling weight 7, the contraction rate is X x = \Re{\{— 2 + a + 7 + a/(« + 7) 2 — 4/5i/3 2 )) |- These 
two conditions thus establish the upper-bound on the permissible weight of < y/X x X z . Since 
X x < X z , a good approximation is 



< A, (61) 



2.8.2 Case 2: transition 



Here, the transition from one pattern of synchrony to an other (two states) is investigated. For 
this purpose, both states x\ and x 2 exist. Also, the TN z\ is connected. Since activating z\ leads 
from a transition from state 1 to 2 (represented by X\ and x 2 ). In the following, we assume that 
the network is in X\ when initialized and z\ is active (i.e. T TN = 0). We then show that the 
network will end up in the second synchrony pattern, representing x 2 . 

Defining V as above and J the appropriate Jacobian of the full system yields 



rVJV T = 



(/i + / 4 )(«-7)"2 -A(Zi + Z 4 ) 

(Z 2 + Z 5 )(a- 7 )-2 -^(Za + Zg) 



(62) 



Similarly to equation (56), the terms of the 3rd WTA do not appear. Thus, activation of the 
TN does not disturb the synchrony between x and y as such but only which particular units 
synchronize (this is not visible in above equation). 

Whether or not the system transitions from one pattern of synchrony to another is determined 
by the threshold behavior of the activation function. As long as the input to x 2 , 4>z\ > for 
sufficient amount of time, the network will switch its pattern of synchrony. If, on the other hand, 
Z\ is not active for a sufficiently long time (relative to the contraction rate A x ), the system will 
return to the previous pattern of synchrony. Also note that Z\ switches off automatically as soon 
as the transition occurs (since then y\ ceases to be active). Thus, the timing of the external input 
does not have to be tightly connected with the external dynamics, or can even be permanently 
present. 

2.9 Verification by simulation 
2.9.1 Single WTA 

The properties of contraction during the application of external input can be demonstrated by 
numerical simulation of a single WTA network consisting of 2 excitatory and 1 inhibitory units 
(see Fig 2B for wiring). Figure 6 shows the dynamics of the 3 units in the network while external 
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Figure 6: Simulation of a single WTA network (Fig 2B). (A) Excitatory units and external input. 
(B) First derivative with respect to time for the excitatory units. The activity of 22 is offset by 
100 timesteps relative to X\ for plotting purposes only. (C) Activity of the inhibitory unit. (D) 
Derivative of the inhibitory unit. (E) The maximal eigenvalue Re(X max (Fg)) of the generalized 
Jacobian. (F) The minimal eigenvalue of \ m i n (V JV S ) . Activity is plotted in arbitrary units, x-axis 
is in units of integration timesteps in units of 1000 (Euler integration with A = 0.01). See the text 
for details. 
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Figure 7: Simulation of networks consisting of two (A-B) and three (C-D) recursively coupled WTA 
networks. (A-B) Simulation of two symmetrically coupled WTAs as shown in Fig 2D (with = 0). 
In this network, there are two 7 connections between the maps (between X\,yi and £2,2/2)- (A) 
shows the excitatory units, (B) the inhibitory units. (C-D) Simulation of three coupled WTAs, as 
illustrated in Fig 2D. Maps x and y are coupled symmetrically as in (A-B), whereas map z creates 
a unidirectional feedback loop between y and x (see Fig 2D for details). Shown is the activity of 
the excitatory units on x and y (C) as well as on the third map (z, panel D). Units of time are 
integration timesteps in units of 1000. See text for details. 
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Figure 8: Simulation of a randomly connected network consisting of 6 WTA modules. (A) Weight 
matrix of the entire network. (B) External inputs used for the simulation shown in C-D. (C) Color- 
coded diagram of the winner on each WTA. Winning units are either none (0) or 1-4 (indicated 
by the color, representing 0-4). The bottom row shows to which WTA external input is currently 
provided (1-6). (D) Plot of the activity of every unit in the network during application of the 
input shown in B. Note how on every WTA, only one unit can be active (after convergence). The 
color-code represents the activity of each unit, in arbitrary units. Units of time are integration 
timesteps in units of 1000. 
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input is applied to the two excitatory units x\ and x 2 - The input I(t) is a binary pulse of amplitude 
2.0 to x\ and 1.8 to X2 (difference 10%). Note how the network amplifies the small difference (27 
is the winner). Parameters were: a = 1.3, /3i = 2,02 = 0.25, T — 0,G — 1, i.e. the gain was 
g = 5 (steady-state amplitude g * I). These parameters satisfy all conditions for contraction. The 
properties of contraction are evaluated at every point of time by evaluating the effective Jacobian 
at this point of time. The maximal eigenvalue Re(X max (F s )) of the generalized Jacobian indicates 
whether the network is currently contracting or not. Whenever the values is below zero (dashed 
line), F s is negative definite. 

The minimal eigenvalue of X m i n (VJV s ) (Fig 6E) indicates points of time when the network is 
not contracting. Whenever the value is above the dashed line, VJV T is positive definite. Note the 
interplay of the dynamics of the maximal eigenvalue of the generalized Jacobian and the minimal 
eigenvalue of V JV T , which together reflect the dynamic state of the system. Whenever the system 
is contracting, X ma x(F s ) < 0. Shortly after the onset of the external input (t=1000), no winner has 
been selected and the system is not contracting. Instead it is diverging exponentially towards a 
contracting region of the state space. Note also the other important transition of the system after 
the offset of the input (t=6000). After a while only the inhibitory neuron is active, which explains 
the change around t=7000. 

2.9.2 Two-and three WTA networks 

Next we simulated two networks: one consisting of two (Fig 7A-B) and one consisting of 3 (Fig 
7C-D) recursively coupled WTAs, connected by either bidirectional or directional connections. 
Parameters were: a = 1.3, fi\ = 2.8, $2 = 0.25,7 = 0.15, T = 1. For the second simulation, 
in addition T TN = 5,0 = 0.3. These simulations (see the legend of Fig 7 for details) illustrate 
the dynamics of the network during persistent state dependent activity that exists due to the 
7 connections. Also, it shows how these states can be utilized to implement state dependent 
transitions (see Fig 2D for connectivity). 

Notice how, after application of external input to x±, activity persists (Fig 7A). After applying 
input to x 2 , the winner switches. Increasing inhibition (onset at t=15000, panel B) globally resets 
both maps. Note also how application of external input to Z\ leads to a transition from the first 
to the second state only if the network is in state x\ (case (1) vs (2) as indicated in Fig 7D). This 
illustrates a state dependent reaction of the network to external input. The third input application 
(case (3) in Fig 7D) illustrates the stability of the network, even if the transition is onto itself. 
Note how the network reaches a stable level of activity before the external input to z 2 is removed, 
indicating balanced inhibition/excitation despite multiple feedback loops both within and between 
maps. 

2.9.3 Large networks 

Contraction properties are particularly useful because they are preserved if several contracting 
systems are combined to form a larger system (Slotine, 2003; Lohmiller & Slotine, 1998). Such 
systems can be combined in several ways, including parallel combinations, hierarchies, and certain 
types of feedback to form a new system. The resulting composite system is guaranteed to be 
contracting as well if the underlying sub-systems are contracting (Slotine & Lohmiller, 2001; Slo- 
tine, 2003). Note that, in themselves, combinations of stable modules have no reason to be stable 
(Slotine & Lohmiller, 2001). This is only guaranteed if the constituting modules are contracting. 
To illustrate that contracting WTA networks can be used to construct larger networks, we next 
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simulated a network consisting of 6 identical WTAs (Fig 8). The purpose of this simulation is to 
demonstrate that satisfying contraction-properties at the level of the constituting modules (one 
WTA) is sufficient to guarantee stability of a large connected network with many (potentially 
unknown) feedback loops. 

Each WTA consists of 4 excitatory and 1 inhibitory unit. The first 4 WTAs represent states and 
remaining two state-dependent transitions. Bidirectional 7 connections were generated randomly 
between the excitatory units of these 4 WTAs. Unidirectional connections where placed randomly 
between units on the last 2 WTAs and units on the first 4 WTAs. Parameters were a = 1.3, 
fli = 3.2, 02 = 0.25, 7 = 0.15, T = 1, and <fi = 0.3. Inputs were generated randomly, first 
restricted to only the state-carrying WTAs and later only to the transition-inducing units. One 
instance of this simulation is shown in Fig 8. Note the following features of the result: i) the 
network is in a non-linear hard- WTA configuration: except for transients, only one excitatory unit 
in each WTA is active (Fig 8D). ii) The reaction to the same external input depends on the state 
of the network. For example, consider the inputs to WTA 5 in Fig 8C. Input to unit 1 (blue) is 
provided twice, but only once does this induce a state switch (visible as change of winners on the 
first WTAs). iii) levels of activity are strictly bounded, despite external input and high gain of 
the network (Fig 8D). Similar simulations with other weights (even randomized) are also stable, 
as long as the parameters are within the ranges permitted. 

Simulation of a networks consisting of 100 and 1000 WTAs, probabilistically connected as 
described above, were found to be stable as well. We assume the parameters of the WTA itself 
are static. The critical concerns are the connections between the WTAs (7 and 0). As long the 
pairwise connection probability between a given WTA and all other WTAs is sufficiently low, 
stability will be guaranteed. This is because the sum of all 7^ which connect to a particular unit 
needs to observe the constraint shown in Eq 50: ■ jj < 2yjf3\f32 — a. 

3 Discussion 

Biology is replete with examples in which relatively independent sub-systems are coupled together 
by various degrees of time-varying positive and negative feedback, and nevertheless the entire 
system is functionally stable. The neocortex, with its billion neurons parcellated into millions of 
interconnected local circuits is one striking example of such a system. We have chosen to study 
this cortical case, because the functional stability of the vastly interconnected cortical system is 
intimately associated with the expression of stable intelligent behavior. 

Our contribution toward understanding this intriguing phenomenon of collective stability, is 
that we have identified and properly analyzed a small functional module (here a WTA circuit) and 
showed that this knowledge alone allows us to guarantee stability of the larger system. By insta- 
bility we here mean anything that leads to run-away behavior, which in biological neurons means 
the neurons will latch into saturation and, if active long enough, die (Syntichaki & Tavernarakis, 
2003). We do not consider this seizure-like pathological state to offer 'boundedness' in any useful 
sense. Instead, it is the case that neurons usually operate at a small fraction of their maximum 
firing rate, and their networks are only stable because of tight inhibitory-excitatory balance. The 
brain pays considerable attention to maintaining this balance. For these reasons the stability of 
neuronal networks is better cast as stability in a network of (effectively) positively unbounded 
linear threshold neurons, than for example sigmoidal neurons. The simplified model neurons we 
used make the problem tractable, while preserving the features of real neurons that we believe are 
crucial to understand neuronal circuits and their stability. We thus expect our observations to be 
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directly applicable to spiking neurons. 

The strength of these results is that they cast light on the problem of how global stability 
can be achieved in the face of the locally strong feedbacks required for active, non-linear signal 
processing. We have examined this problem in networks of WTA circuits, because the WTA is a 
rich computational primitive, and because neuroanatomical connectivity data suggest that WTA- 
like circuits are a strong feature of the networks formed by neurons in the superficial layers of 
cortex. 

In essence, we have employed Contraction Analysis to demonstrate that a WTA network can at 
the same time be contracting and strongly amplifying. Our key results are the bounds documented 
in Eqs 23-25, 50, 61 and illustrated in Fig 3. It is important to note that this analysis could not 
have been performed with standard linear tools (such as eigenvalues of Jacobians at fixed points), 
which rely on linearizations and do not provide global stability conditions. While in principle 
the asymptotic convergence may have been demonstrated using a Lyapunov function (Slotine & 
Li, 1991), actually no such function is known for these kinds of networks. In contrast, using 
contraction analysis we could demonstrate exponential (as opposed to asymptotic) convergence for 
arbitrary initial conditions. In addition to systematic analysis, we have also confirmed our results 
with simulation of random systems composed of WTAs. The WTA network thus constitutes a 
strong candidate for a canonical circuit which can serve as a basis for the bottom-up construction 
of large biological and artificial computational systems. Our approach is similarly applicable to 
functional modules other than WTAs, such as liquid state machines with feedback (Maass, Joshi, 
& Sontag, 2007). 

To systematically analyze subsets of active neurons which should either be contracting or not 
(i.e. permitted winners or not) we utilized what we called U terms. The parcellation of the active 
subsets of the network using such terms can be regarded as a generalization of the approach of 
permitted and forbidden sets ("switching matrix") (Hahnloser et al., 2003). However, that previous 
approach is suitable only for fully symmetric networks, whereas the proper operation of a WTA 
network requires asymmetry of the inhibitory connections. Our approach is to exhaustively show 
for any possible subset of the U terms that it is either exponentially diverging or contracting. This 
way, the network as a whole is guaranteed to exponentially converge to one of the fixed points. 
The concept we developed can be applied at any point of time during the operation of the network. 
In particular it can be applied before the winner is known. Our reasoning indeed guarantees that 
a winner will be selected exponentially fast. 

Winner-take-all networks are representatives of a broad class of networks, where a number of 
excitatory units share a common inhibitory signal that serves to enforce competition (Dayan & 
Abbott, 2001; Amari & Arbib, 1977; Yuille & Geiger, 2003; Hertz et al., 1991; Tank & Hopfield, 
1986; Rabinovich et al., 2000; Ermentrout, 1992; Schmidhuber, 1989). There are many instances 
of networks that share this property, including various neural networks but also gene regulatory 
networks, in-vitro DNA circuits (Kim, Hopfield, & Winfree, 2004) and development. Competition 
enforced by shared inhibition among excitatory units is a principal feature of brain organization 
(Kurt et al., 2008; Baca, Marin-Burgin, Wagenaar, & Kristan, 2008; Tomioka et al., 2005; Pouille, 
Marin-Burgin, Adesnik, Atallah, & Scanziani, 2009; Buzsaki, 1984; Mittmann, Koch, & Hausser, 
2005; Gruber, Powell, & O'Donnell, 2009; Sasaki, Matsuki, & Ikegaya, 2007; Papadopoulou, Casse- 
naer, Nowotny, & Laurent, 2010) and our findings are thus directly applicable for reasoning about 
such circuits. WTA-type behavior has been experimentally demonstrated in a variety of brain 
structures and species including the mammalian hippocampus and cortex (Kurt et al., 2008; Baca 
et al., 2008; Mittmann et al., 2005; Gruber et al., 2009; Sasaki et al., 2007; Busse, Wade, & Caran- 
dini, 2009). Also, the existence of functional WTA circuits has been suggested based on strong 
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anatomical evidence in others, in particular 6-layer cortex (Binzegger et al., 2004; Tomioka et al., 
2005). 

Combining several WTA networks permits the implementation of computational operations 
which can either not be performed by a single WTA or which would require an unrealistically 
large WTA. In this paper, we show the constraints that need to be imposed on combinations of 
such WTAs such that the new combined system is guaranteed to be stable. As an illustration 
of the computational ability of a network consisting of several WTAs, we simulated a network 
of 3 WTAs coupled both using symmetric and asymmetric connections (Figure 7). This network 
demonstrates two crucial properties that combinations of WTAs permit: persistent activity in 
the absence of input and state-dependent reaction to external input. While originally designed 
to demonstrate the stability property, this is also a novel generalization of our previous state- 
dependent processing network (Rutishauser & Douglas, 2009) as well as the pointer-map approach 
(Hahnloser et al., 1999). Note that in contrast to the previous work, here all 3 WTAs are fully 
homogeneous (identical). The only modifications needed are to establish appropriate connections 
between the WTAs. In contrast, our previous networks consisted of one or several WTAs plus 
specialized units. This new and more generic version makes the networks more stable, as excitatory 
units can only exist as part of a WTA and thus always receive balanced inhibitory input. As in 
the original (Rutishauser & Douglas, 2009), this enhanced three WTA network is also capable of 
implementing any regular language (a state automaton). 

While we demonstrated our approach for WTAs, our approach is sufficiently general to reason 
systematically about the stability of any network, biological or technological, composed of networks 
of small modules that express competition through shared inhibition. For example, synthetic DNA 
circuits can perform computations, self-assemble and provide a natural way to enforce competition 
through shared inhibition (Kim et al., 2004; Rothemund, Papadakis, & Winfree, 2004; Adleman, 
1994). Both natural and synthetic gene regulatory networks depend on networks of stochastic 
chemical reactions, resulting in a system of many nested feedback loops. Robustness is thus 
a crucial issue (Soloveichik, 2009) and the notion of contracting modules to describe complex 
aggregates might provide crucial insights into such systems. These networks can also implement 
WTA-like computations (Kim et al., 2004). Self-assembly of synthetic DNA circuits and biological 
tissue in general relies on de-novo bottom-up construction, analogous to our notion of first building 
small contracting modules and then composing a system consisting of such modules. 

We expect that our results will have immediate application in interpreting the behavior of 
biological neural networks. For example, most synapses in the nervous system are both unreliable 
and plastic and so the postsynaptic effect elicited by given action potential is uncertain and time 
varying. Contracting systems and their aggregations remain stable under these constraints pro- 
vided the parameters remain within certain well-defined and rather broad ranges. Time-varying 
changes both in the input as well as the structure of the network is thus permitted within a broad 
range without endangering the stability of the system (Lohmiller & Slotine, 1998). This is partic- 
ularly important if the system is modifying itself through processes such as developmental growth, 
synaptic plasticity or adult neurogenesis. 

A key question in neuroscience is how a large system such as the brain can rapidly modify itself 
to adapt to new conditions. A possible solution that our results suggest is that some parts of the 
network (the modules, here WTAs) could be pre-specified whereas the connections between the 
modules are learned or modified. This would greatly reduce the required amount of learning or 
developmental growth processes. A key question is how rules of plasticity can be utilized to enable 
such learning. It is also an open question whether and how a given WTA can be systematically 
decomposed into a combination of smaller WTAs that still perform the same function. This 
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question is crucial because the realistic size of a given WTA is restricted by the size of the projection 
field of recurrent inhibition. Our results provide a framework for investigation of these important 
questions. 

4 Appendix 

4.1 Appendix A: Constraints for persistent activity 

The memory state requires Xi = while Xi > for the unit % that represents the last winner. At 
this point, some necessary constraints on the weights can be derived. Since Xi > 0, unit % is fully 
linear. The activity of this unit is described by 

TXi = Xi(a — G) — P\Xn — Tj = (63) 

Thus, Xi(a — G) = (3\x N + 7*. It follows that a > G for xi to be positive. Intuitively, 
a — G represents the effective recurrent input the unit receives after accounting for the load (which 
causes it to decay exponentially to zero in the absence of input). This effective recurrent input 
needs to be strong enough to account for the negative inhibitory input as well as the constant 
current subtracted by the threshold. Also, in the two-map case, 7 can be incorporated in this 
argument as a + 7 > G (this follows from the synchrony, see below). Experimental measurements 
in neocortex indicate that a > G (Douglas et al., 1995). The open-loop gain of such systems is > 1 
(sometimes substantially so). Thus, these networks are only stable because of balanced inhibition. 
The requirement to operate in the a > G posses strict requirements for stability. 

4.2 Appendix B: Notes on approximation of the activation function 

To motivate the lj terms, consider the smooth non-linearity f(x) = log(a+exp(b(x+c)) with a — 1, 
6=1 and c = 0. Then, x = ex ^^ +l ■ For large x, this is approximately equal to 1 and for negative 
x < approximately equal to 0. An approximate solution to the derivative is thus either or 1. 
Indeed, the sufficient stability conditions provided by the following analysis will be unchanged if 
the nonlinearity is replaced by a general sigmoid of slope within the interval [0, 1], corresponding 
to the dummy terms taking any time- varying value between and 1. In optimization, a frequently 
used approximation is p(x,a) = x + - log(l + exp(— ax)) (the integral of the sigmoid function) 
(Chen & Mangasarian, 1995). a > is a constant. Its first derivative is 1+ex ^_ ax ^ and its second 

derivative is " cx p(~ a:r ) _ derivatives are bounded between and 1 and this function can thus 

(l+exp(— ax)y 

be used as a approximation. The approach that we use in this paper is thus similarly valid for this 
and other smooth approximations (as long as the derivative is bounded <= 1). 

4.3 Appendix C: Time-constants 

All parameters given in the paper assume that the time-constant r is the same for inhibitory and 
excitatory units. Similar conditions can be derived if this is not the case. In particular, it is of 
biological interest to consider inhibitory time-constants 77 which are larger than the excitatory 
time-constants te- Taking possibly different time-constants into account, the bounds in Eqs 23-25 
become 
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l<a<2jl3 1 P2— + l-— (64) 

V T i T i 

) T -<^2<- (65) 

4 Ti Ti 

Note that the key variable is the ratio If te — Tj, the bounds reduce to Eqs 23-25. 
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